Systems and Methods for Motion-Vector-Aided Video Interpolation Using Real-Time Smooth Video Playback Speed Variation

ABSTRACT

Systems and methods for encoding and playing back video at adjustable playback speeds by interpolating frames to achieve smooth playback in accordance with embodiments of the invention are described. One embodiment includes a source encoder that includes a processor, memory including an encoder application, where the encoder application directs the processor to: select a subset of frames from a first video sequence; generate motion vectors describing frames from the first video sequence that are not part of the selected subset of frames, where each motion vector describes movement between a frame in the subset of frames and a frame not included in the subset of frames; store the motion vectors; decimate frames not included in the subset of frames from the first video sequence to generate a second video sequence having a nominal frame rate less than the frame rate of the first video sequence; and encode the second video sequence at the nominal frame rate.

CROSS-REFERENCE TO RELATED APPLICATIONS

The current application is a continuation of U.S. patent application Ser. No. 14/843,782, entitled “Systems and Methods for Motion-Vector-Aided Video Interpolation Using Real-Time Smooth Video Playback Speed Variation” to Espeset et al., filed Sep. 2, 2015, which is a continuation of U.S. patent application Ser. No. 14/503,029, entitled “Systems and Methods for Motion-Vector-Aided Video Interpolation Using Real-Time Smooth Video Playback Speed Variation” to Espeset et al., filed Sep. 30, 2014 and issued on Sep. 5, 2015 as U.S. Pat. No. 9,131,202, which application claims priority to U.S. Provisional Application No. 62/005,608, “Systems and Methods for Motion-Vector-Aided Video Interpolation for Reduced Video File Size And Real-Time Smooth Video Playback Speed Variation” to Hardy et al., filed Mar. 30, 2014, the disclosures of which are incorporated herein by reference.

FIELD OF THE INVENTION

The present invention generally relates to playback of video content and more specifically to enabling a playback device to vary the playback speed of the video in real time while maintaining a frame rate at each different playback speed by interpolating additional frames in real time.

BACKGROUND

Distributing video media from content providers to content consumers often requires transmitting large amounts of data across a network. These large videos are often a challenge to process, store, and transmit. In order to allow for the playback of video content on different devices with various processing capabilities, many content providers store different versions of the same video content encoded at different resolutions and/or maximum bitrates (e.g., High Definition and Standard Definition) and are thus able to distribute different encodings of a piece of video content to a playback device based on prevailing network conditions. Network conditions are especially important when streaming video content in real-time since network deterioration may result in a stuttering effect of the video being played back on the playback device. The size of an original piece of video content may be reduced by re-encoding the video using different encoding parameters such as (but not limited to) picture resolution (e.g., 720p, 1080p, 4K, etc.), frame rate (e.g., 24, 30, 48, 60 frames per second, etc.), bitrate (e.g., 12 Mbps, 40 Mbps, etc.), frame size, and color depth, among various other characteristics of the video.

Most standard video playback has historically been delivered at a range of between 24 and 30 frames per second. However, many of today's video games, as well as an increasing portion of TV sets, are able to render video at higher rates, often up to 60 frames per second and beyond. This increased frame rate can provide the user with a smoother, more fluid viewing experience and is generally considered to more closely resemble real world movement. However, distributing video at such frame rates often requires a significant amount of bandwidth capacity being made available on the network.

Furthermore, media distributed by content providers to content consumers is often encoded using a variety of video compression standards that facilitate the distribution of the content across a network. Well known compression standards include H.264/MPEG-4 AVC, published by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC JTC1 Moving Picture Experts Group (MPEG), and the newer High Efficiency Video Coding (HEVC) standard, MPEG-H Part 2, developed by MPEG and VCEG, both of which are herein incorporated by reference. Video compression or video encoding typically involves compressing the amount of information used to describe the frames of video in a video sequence in order to reduce the consumption of network resources when distributing content. Media playback devices may include a video decoder used to decode an encoded video prior to playback on the device. However, video decoded from an encoded video elementary bit stream is often limited to playback at the particular encoding profile at which the video was initially encoded. In particular, the frame rate used to play back a video is typically determined based on the particular encoding profile used to encode the video, and varying the playback speed of a video may affect the viewing quality of the video.

SUMMARY OF THE INVENTION

Many different applications can benefit from the ability to smoothly vary the playback speed of a video sequence by interpolating additional frames between frames in the sequence in real time, including video games, surveillance, and video editing, among various others. However, currently reducing the speed of video in real-time, if the speed is reduced sufficiently and the frame rate of the original video file is not high enough, produces a stuttering effect whereby the motion does not look smooth and natural to a viewer. Generally, this becomes observable to the human eye when the playback frame rates are reduced below 18-20 images per second. Likewise, if speed is increased, there may be cases when a frame required for smooth motion is missing in the frames of the video sequence, and a closest frame may be repeated for display. This may also produce a small visual stutter. Several embodiments of the invention allow for playback at adjustable playback speeds by interpolating the necessary number of frames to play back the encoded video at a desired frame rate that provides a smooth viewing quality. In particular, since the motion data are stored as vectors, new frames may be interpolated for any particular time interval between a series of frames and thus the playback device is able to play back the video at any desired playback speed while retaining a smooth video viewing experience.

There can also be a significant benefit to distributing video content encoded at a reduced size yet providing the same level of image quality as the original video encoding. However, re-encoding video to reduce the size of the encoded video content, when the original video content has already been efficiently encoded, generally results in a reduction of the quality of the video played back by a playback device. Many embodiments of the invention are able to encode video by decreasing the number of frames and inserting motion vectors that can be used to interpolate the deleted frames during playback.

Systems and methods in accordance with embodiments of the invention encode and play back video at adjustable playback speeds by interpolating frames to achieve smooth playback. One embodiment includes a source encoder that includes a processor, memory including an encoder application, where the encoder application directs the processor to: select a subset of frames from a first video sequence; generate motion vectors describing frames from the first video sequence that are not part of the selected subset of frames, where each motion vector describes movement between a frame in the subset of frames and a frame not included in the subset of frames; store the motion vectors; decimate frames not included in the subset of frames from the first video sequence to generate a second video sequence having a nominal frame rate less than the frame rate of the first video sequence; and encode the second video sequence at the nominal frame rate.

In a further embodiment, the encoder application directs the processor to store the motion vectors encoded as pixels within frames in the subset of frames.

In another embodiment, a motion vector includes an angle component and a magnitude component, where the angle component is stored within a hue value of a pixel and the magnitude component is stored within a brightness value of a pixel.

In a still further embodiment, the subset of pixels are located within a portion of the frame that is not displayed during playback of the decoded frame.

In yet another embodiment, the encoder application configures the processor to encode the new video sequence as a sequence of intra and inter frames.

In yet another embodiment, each motion vector describes the movement of at least one pixel between a frame and at least one subsequent frame in the first video sequence.

In a further embodiment, the motion vectors are stored in a separate file, and each motion vector corresponds to a particular frame in the subset of frames.

In a still further embodiment, the encoder application configures the processor to: compute a motion vector for each pixel in a frame in the subset of frames; and compress motion vectors of a subset of neighboring pixels in the frame to generate a new motion vector.

In a still further embodiment, the encoder application configures the processor to: compute a motion vector for each pixel in a frame in the subset of frames; and discard a plurality of motion vectors to generate a set of motion vectors describing a frame that is not in the subset of frames.

A further additional embodiment includes a playback device that includes a processor configured to communicate with a memory, where the memory contains a media player application and a decoder application, where the decoder application directs the processor to: configure a video decoder based upon encoding parameters including a nominal frame rate; decode encoded frames of video using the video decoder to provide decoded frames of video at the nominal frame rate; where the media player application directs the processor to: configure a video decoder to decode encoded frames of video by providing the decoder application with encoding parameters including a nominal frame rate; extract a set of motion vectors describing a frame of video that can be interpolated based upon at least one decoded frame of video received from the video decoder; interpolate at least one interpolated frame using the set of motion vectors and the at least one frame of decoded video; and play back a sequence of frames of video including at least one decoded frame of video and at least one interpolated frame of video.

In a further embodiment, the media player application directs the processor to extract a set of motion vectors describing at least one interpolated frame of video from a subset of pixels in at least one decoded frame of video.

In another embodiment, the media player application directs the processor to analyze a hue value of a pixel in a decoded frame of video to determine an angle of a motion vector and analyze a brightness value of a pixel to determine a magnitude of a motion vector.

In a still further embodiment, the media player application directs the processor to: determine a playback speed for the decoded video; and interpolate a number of new frames based on the playback speed and the nominal frame rate of the encoded video.

In a yet further embodiment, the media player application directs the processor to compute motion vectors describing each pixel in an interpolated frame using the set of motion vectors and bilinear filtering.

In yet another embodiment, the media player application directs the processor to play back a sequence of frames comprising at least one decoded frame followed by at least one interpolated frame.

In a further embodiment, the media player application directs the processor to extract a set of motion vectors describing at least one interpolated frame from red, green, blue pixel values of at least one pixel in at least one decoded frame.

In another embodiment, the media player application directs the processor to extract the set of motion vectors describing a frame of video that can be interpolated based upon at least one decoded frame of video received from the video decoder from pixels located in a subset of rows of the at least one decoded frame.

In still another embodiment, the media player application directs the processor to play back the portion of the at least one decoded frame that does not contain rows of pixels from which motion vectors are extracted.

In a further embodiment, the set of motion vectors are extracted from a separate file.

In still a further embodiment, the media player application configures the processor to: determine a change in the playback speed of the decoded video; and interpolate a number of frames between decoded frames of video to maintain a frame rate of the decoded video at the changed playback speed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a network diagram of a system for distributing video encoded with motion vectors in accordance with an embodiment of the invention.

FIG. 2 conceptually illustrates a server configured to encode video with motion vectors in accordance with an embodiment of the invention.

FIG. 3 conceptually illustrates a playback device configured to play back video encoded with motion vectors in accordance with an embodiment of the invention.

FIG. 4 is a flow chart illustrating a process for encoding and decoding video at a frame rate that is higher than the frame rate of the encoded video stream by interpolating the encoded frames using motion vectors in accordance with an embodiment of the invention.

FIG. 5 is a flow chart illustrating a process for encoding motion vectors as additional pixels in a video frame in accordance with an embodiment of the invention.

FIG. 6 is a flow chart illustrating a process for using motion vectors to interpolate additional frames from the decoded frames of a video sequence in accordance with an embodiment of the invention.

DETAILED DESCRIPTION

Turning now to the drawings, systems and methods for encoding and playing back video at adjustable playback speeds by interpolating frames to achieve smooth playback in accordance with embodiments of the invention are illustrated. In a number of embodiments, re-encoding video at a reduced frame rate may reduce the size of an encoded video sequence. To reduce the number of frames in a video encoding and yet maintain the quality of the video, many embodiments compute and store motion vectors describing the frames of the originally encoded video content that are deleted during the re-encoding of the video at the lower frame rate. These motion vectors are stored in such a way that they can be accessed following the decoding of the re-encoded video sequence. In this way, the motion vectors may be used to interpolate the deleted frames during playback to enable playback of the video sequence at the original frame rate, despite the video sequence being encoded at the lower frame rate. Furthermore, a playback device may use the motion vectors to maintain the visual quality of the video content when varying playback speed in real-time. Many embodiments maintain visual quality of video content at different playback speeds by interpolating additional frames of video as needed to maintain a particular frame rate at the particular playback speed. In particular, since the motion data of pixels between a series of frames may be stored as vectors with a direction and length (i.e., magnitude) component, new frames may be interpolated between a series of frames for any particular time interval (i.e., playback speed) by computing an estimate of the amount a pixel is likely to move in the direction of the motion vector during a specific time interval. The motion vectors provide the positions of moving objects at times between a first and second frame and thus one or more interpolated frames inserted between the first and second frames can show the objects at the interpolated positions indicated by the motion vectors.
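By way of illustration only, the estimate of how far a pixel moves during a fraction of the interval between two frames can be sketched as follows. The sketch assumes linear motion along the vector, which is a simplifying assumption rather than a requirement of any embodiment, and the function names `interpolate_position` and `interpolation_times` are hypothetical.

```python
def interpolate_position(x, y, dx, dy, t):
    """Estimate where a pixel located at (x, y) in the first frame sits at time t.

    (dx, dy) is the motion vector from the first frame to the second frame,
    and t is the fraction of the interval elapsed (0.0 = first frame,
    1.0 = second frame). Linear motion is assumed for this sketch.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie between 0.0 and 1.0")
    return (x + dx * t, y + dy * t)


def interpolation_times(count):
    """Return evenly spaced fractional times for `count` interpolated frames."""
    return [k / (count + 1) for k in range(1, count + 1)]


if __name__ == "__main__":
    # Three interpolated frames between two decoded frames.
    for t in interpolation_times(3):
        print(t, interpolate_position(10, 20, 6, -3, t))
```

Because the fraction t is arbitrary, the same vector can serve any number of interpolated frames between the two decoded frames.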

In order to generate the motion vectors, many embodiments analyze data regarding how pixels move between frames and save the movement data as motion vectors. In several embodiments, the motion vectors may be embedded in the video frames that are encoded to create a reduced frame rate video sequence. In other embodiments, the motion vectors may be stored in a separate location within the video elementary bitstream, a separate location within a container file, and/or within a separate file that accompanies the encoded video. In several embodiments, the motion vectors are encoded to require less storage space than the video frames deleted from the original sequence of (encoded) video frames thus reducing the size of the original encoded video. Accordingly, the reduced size of the encoded video may reduce the storage space utilized when storing the encoded video. Furthermore, the reduced size of the encoded video may facilitate the distribution and subsequent processing of the encoded video across networks. In particular, the reduced size of the encoded video may reduce the amount of network bandwidth utilized when distributing the encoded video.

In several embodiments, the motion vectors provide a direction and distance that one or a collection of pixels has moved relative to their locations in one or more video frames. In many embodiments, the motion vectors may describe the forward movement of pixels through subsequent frames of a video sequence relative to a particular reference frame. In some embodiments, the motion vectors may also describe the backwards movement of pixels through preceding frames of a video relative to the particular reference frame. In certain embodiments, a motion vector is stored using Cartesian coordinates (i.e., (x,y) movement) that provide a numeric offset of a number of pixels by which a particular pixel moves between frames. Numerous embodiments may define the motion vectors using polar coordinates (i.e., distance, angle). In several embodiments, the motion vectors may be encoded as additional pixels within a frame. Several embodiments may use hue-saturation-lightness (HSL) or hue-saturation-brightness (HSB) representations of a red, green, blue (RGB) color model to encode motion vectors. In particular, in some embodiments that use polar coordinates, an angle component of a motion vector may be stored as a hue value of a pixel and the length component of the motion vector may be stored as the brightness value of the pixel. Using hue and brightness values of a pixel to store the motion vectors may provide for an increased level of precision regarding the movement of the pixels relative to the motion vectors used in encoding inter frames (i.e., frames encoded by reference to one or more frames in the video sequence). Furthermore, the hue and brightness values of a pixel are generally subject to less loss of information during subsequent encodings that may be applied to a video sequence in comparison to other color values of a pixel such as the red, green or blue component values.
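One possible way of storing a polar motion vector in the hue and brightness of a pixel is sketched below for illustration. The sketch uses Python's standard `colorsys` module, fixes saturation at full, and scales magnitude against a hypothetical maximum `MAX_MAGNITUDE`; these choices, the 8-bit channel depth, and the names `encode_vector` and `decode_vector` are assumptions of the example and are not specified by the embodiments described above.

```python
import colorsys
import math

# Hypothetical largest vector length (in pixels) that maps to full brightness.
MAX_MAGNITUDE = 64.0


def encode_vector(dx, dy, max_magnitude=MAX_MAGNITUDE):
    """Encode a motion vector as an 8-bit (r, g, b) pixel.

    The vector angle is stored as hue and the vector length as brightness.
    """
    angle = math.atan2(dy, dx) % (2 * math.pi)            # 0 .. 2*pi
    magnitude = min(math.hypot(dx, dy), max_magnitude)    # clamp to the range
    hue = angle / (2 * math.pi)                           # 0 .. 1
    brightness = magnitude / max_magnitude                # 0 .. 1
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, brightness)
    return tuple(round(c * 255) for c in (r, g, b))


def decode_vector(rgb, max_magnitude=MAX_MAGNITUDE):
    """Recover an approximate (dx, dy) from a pixel written by encode_vector."""
    r, g, b = (c / 255 for c in rgb)
    hue, _saturation, brightness = colorsys.rgb_to_hsv(r, g, b)
    angle = hue * 2 * math.pi
    magnitude = brightness * max_magnitude
    return (magnitude * math.cos(angle), magnitude * math.sin(angle))


if __name__ == "__main__":
    pixel = encode_vector(0.0, 16.0)
    print(pixel, decode_vector(pixel))
```

The round trip is approximate because each color channel is quantized to 8 bits; a larger `max_magnitude` trades precision for range.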

In order to allow for real time decoding of the color information to extract motion vectors and perform interpolation of one or more frames of video, some embodiments use a graphics processing unit (GPU) within a playback device capable of performing many operations in parallel. The term GPU is generally used to describe a class of electronic circuits designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display. Many GPUs exploit a highly parallel structure, which makes them more effective than general-purpose processors for algorithms where processing of large blocks of data can be performed in parallel. Several embodiments utilize lookup tables that output a hue angle based upon the red, green and blue color component values of a pixel. This can provide a dramatically increased frame rate with negligible overhead.
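The lookup-table idea can be illustrated on a general-purpose processor with the following sketch, which precomputes a hue angle for every quantized combination of red, green and blue values. The 4-bit quantization per channel (`BITS`), the table name `HUE_LUT`, and the function `hue_angle` are hypothetical choices made to keep the example small; an actual embodiment running on a GPU may organize its tables differently.

```python
import colorsys

# Hypothetical quantization: 4 bits per channel gives a 4096-entry table.
BITS = 4
LEVELS = 1 << BITS

# Precompute the hue angle (in degrees) once for every quantized RGB triple.
HUE_LUT = [
    colorsys.rgb_to_hsv(r / (LEVELS - 1), g / (LEVELS - 1), b / (LEVELS - 1))[0] * 360.0
    for r in range(LEVELS)
    for g in range(LEVELS)
    for b in range(LEVELS)
]


def hue_angle(r, g, b):
    """Look up the hue angle for 8-bit r, g, b values without recomputing it."""
    shift = 8 - BITS
    index = ((r >> shift) * LEVELS + (g >> shift)) * LEVELS + (b >> shift)
    return HUE_LUT[index]


if __name__ == "__main__":
    print(hue_angle(255, 0, 0), hue_angle(0, 255, 0), hue_angle(0, 0, 255))
```

Quantizing the inputs bounds the table size at the cost of angular resolution; the per-pixel work at playback time reduces to an index computation and one memory read.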

Systems and methods for encoding and playing back video at adjustable playback speeds by interpolating frames to achieve smooth playback in accordance with embodiments of the invention are discussed further below.

System Architecture for Distributing Video with Adjustable Playback

A system for encoding and playing back video at adjustable playback speeds by interpolating the encoded frames using motion vector data to achieve smooth playback in accordance with an embodiment of the invention is illustrated in FIG. 1. The system 100 includes a source encoder 102 configured to encode original video content into encoded video. In many embodiments, the source encoder may be used to reduce the size of the encoded video content relative to the original video content, which may itself have already been efficiently encoded at a higher nominal frame rate. In particular, in several embodiments the source encoder may reduce the size of the encoded video by reducing the number of frames in the video, which thereby reduces the nominal frame rate of the video. The term “nominal frame rate” is used here to indicate the rate at which the video is decoded by a video decoder. As is discussed further below, the encoding of the video contemplates playback at a higher frame rate than the nominal frame rate by the interpolation of decoded frames and/or playback at different playback speeds at the nominal or a specified frame rate by interpolating frames. Therefore, the nominal frame rate indicates the number of encoded frames decoded during a specified time interval by a video decoder and is typically specified in a header of a video file containing the encoded video. In certain embodiments, the source encoder may reduce the size of an encoded video by modifying other characteristics of the video, including (but not limited to) picture resolution, frame size, bitrate, and/or color encoding. In order to reduce the size of an encoded video by reducing the frame rate, the source encoder in many embodiments may compute and store motion vectors describing the movement of pixels between frames of the source video sequence and delete frames from the source video sequence described by the motion vectors.

The source encoder 102 may store different versions of the same video content within the media source storage 103. A version of a piece of video content may include video encoded at a certain encoding profile that specifies a particular set of encoding parameters used to encode the video. The encoding parameters may specify, for example, the nominal frame rate, picture resolution, frame size, and/or bitrate used to encode the video. The source encoder may re-encode the video content at a reduced nominal frame rate and include computed motion vectors in order to further facilitate the distribution of the encoded video across a network. Well known compression standards that can be used to encode the sequence of frames contained within the re-encoded video content can include, among various other standards, the H.264/MPEG-4 AVC and the newer HEVC standard. The generation of motion vectors in accordance with various embodiments of the invention is discussed further below.

In the illustrated embodiment, the source encoder is a server including one or more processors directed by an encoding software application. In other embodiments, the source encoder can be any processing device including a processor and sufficient resources to perform the transcoding of source media including (but not limited to) video, audio, and/or subtitles. In some embodiments, the encoded video is then uploaded to a distribution server 104. In many embodiments, the source encoder uploads the encoded video.

In a number of embodiments, the distribution server 104 distributes the encoded video to one or more playback devices 105-107 using one or more distribution channels. The distribution server may distribute the encoded video to different playback devices requesting video. In many embodiments, the distribution server receives and processes download requests from a variety of playback devices that seek to download the encoded video. When the distribution server receives a download request from a playback device, it can provide the playback device with access to download the encoded video. The encoded video may include motion vectors that the playback device can use to interpolate additional frames. A distribution server 104 can also push video content encoded in accordance with embodiments of the invention to playback devices.

In some embodiments, the distribution server receives requests to stream video content from a variety of playback devices and subsequently streams the encoded video to the playback devices for playback. In several embodiments, the variety of playback devices can use HTTP or another appropriate stateless protocol to request streams via a network 108 such as the Internet. In several embodiments, a variety of playback devices can use RTSP whereby the distribution server records the state of each playback device and determines the video to stream based upon instructions received from the playback devices and stored data describing the state of the playback device.

During the playback of video encoded in accordance with various embodiments of the invention, a playback device may initially decode encoded frames of video contained within an elementary bitstream by configuring a decoder based upon encoding parameters provided to the playback device describing the encoding of the elementary bitstream. In several embodiments, the encoding parameters specify characteristics of the encoded video including (but not limited to) a resolution for the encoded frames of video and a nominal frame rate for the encoded frames of video. Video encoding techniques, such as (but not limited to) block based encoding techniques, can encode blocks of pixels within a frame of video by referencing a corresponding block of pixels in a previous and/or subsequent frame of video. The references are typically referred to as motion vectors, because they specify the movement of the block of pixels between frames. During the decoding process, a video decoder on a playback device can use motion vector data within an encoded frame of video to decode the frame of video based upon other decoded frames in the video sequence.

In several embodiments, additional encoding parameters describe characteristics of motion vectors that can be used to interpolate additional frames using the decoded frames from the elementary bitstream. A playback device in accordance with many embodiments of the invention can utilize the additional encoding parameters describing motion vectors that can be used to interpolate additional frames to obtain an additional set of motion vector data, either embedded within the decoded frames of video (i.e., recovered after the encoded video sequence is decoded) or stored separately. The additional set of motion vector data can be used to interpolate additional frames of video between frames from the decoded video sequence as needed based on a desired playback speed and/or frame rate.
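As a hedged illustration of how a desired playback speed and display frame rate might determine which frames are interpolated, the following sketch maps each displayed frame to a decoded frame index and a fractional time past that frame. The function name `source_positions` and its parameters are hypothetical, and only positive (forward) playback speeds are handled in this sketch.

```python
def source_positions(num_output_frames, nominal_fps, display_fps, speed):
    """Map each displayed frame to (decoded frame index, fraction past it).

    nominal_fps is the rate at which frames were encoded, display_fps is the
    rate at which frames are shown, and speed is the playback speed
    (1.0 = normal, 0.5 = half speed). A fraction of 0.0 means the decoded
    frame can be shown as is; any other fraction calls for an interpolated
    frame at that time between the indexed frame and the next one.
    """
    if nominal_fps <= 0 or display_fps <= 0 or speed <= 0:
        raise ValueError("rates and speed must be positive")
    step = speed * nominal_fps / display_fps  # decoded frames advanced per displayed frame
    positions = []
    for k in range(num_output_frames):
        position = k * step
        index = int(position)
        positions.append((index, position - index))
    return positions


if __name__ == "__main__":
    # 30 fps encoded video shown at 60 fps and half speed:
    # three interpolated frames between each pair of decoded frames.
    print(source_positions(5, 30, 60, 0.5))
```

For example, with a nominal frame rate of 30 frames per second, a display rate of 60 frames per second, and normal speed, every second displayed frame falls halfway between two decoded frames and would be interpolated.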

In the illustrated embodiment, playback devices include personal computers 105-106 and mobile phones 107. In other embodiments, playback devices can include consumer electronics devices such as DVD players, Blu-ray players, televisions, set top boxes, video game consoles, tablets, and other devices that are capable of connecting to a server and playing back encoded video. Although a specific architecture is shown in FIG. 1, any of a variety of architectures can be utilized that enable playback devices to request video encoded with motion vectors that may be used to interpolate additional frames as necessary for playback of video at different frame rates as appropriate to the requirements of specific applications in accordance with embodiments of the invention.

The basic architecture of a source encoder in accordance with an embodiment of the invention is illustrated in FIG. 2. The source encoder 200 includes a processor 210 in communication with non-volatile memory 230, volatile memory 220, and a network interface 240. In the illustrated embodiment, the non-volatile memory 230 includes a video encoder 232 that configures the processor to encode video. In some embodiments, the video encoder may also reduce the size of an original piece of encoded video content by reducing the number of frames in the video sequence and generating motion vectors that may be used to interpolate the deleted frames during playback. In several embodiments, the network interface 240 may be in communication with the processor 210, the volatile memory 220, and/or the non-volatile memory 230. Although a specific source encoder architecture is illustrated in FIG. 2, any of a variety of architectures including architectures where the video encoder is located on disk or some other form of storage and is loaded into volatile memory at runtime can be utilized to implement source encoders in accordance with embodiments of the invention.

The basic architecture of a playback device in accordance with an embodiment of the invention is illustrated in FIG. 3. The playback device 300 includes a processor 310 in communication with non-volatile memory 330, volatile memory 320, and a network interface 340. The processor 310 can be implemented using one or more general purpose processors, one or more graphics processors, one or more FPGAs, and/or one or more ASICs. In the illustrated embodiment, the non-volatile memory 330 includes a video decoder 332 that configures the processor to decode encoded video and a media player application 334 configured to obtain encoded video and deliver an elementary bitstream of encoded video to the video decoder. In many embodiments, the media player application 334 may also extract motion vectors from the decoded video frames returned by the video decoder 332 and interpolate additional video frames using motion vector data obtained by the media player application as needed for different playback speeds of the video during playback. As noted above, the motion vector data can be embedded in the decoded frames and/or obtained from a variety of locations including (but not limited to) user data within the elementary bitstream, data within a container file containing the encoded video, and/or a separate file obtained using a manifest that identifies the encoded video and the location of the motion vector data.

In several embodiments, the network interface 340 may be in communication with the processor 310, the volatile memory 320, and/or the non-volatile memory 330. Although a specific playback device architecture is illustrated in FIG. 3, any of a variety of architectures including architectures where the applications are located on disk or some other form of storage and are loaded into volatile memory at runtime can be utilized to implement playback devices in accordance with embodiments of the invention.

Encoding Interpolation Frames Using Motion Vectors

As described above, many embodiments of the invention are able to reduce the size of an original piece of encoded video by reducing the number of frames in the video sequence and storing motion vectors that may be used to interpolate the deleted frames. Furthermore, these motion vectors may be used during playback to modify the playback speed of the video and yet maintain a smooth video viewing experience. A process for encoding video at a nominal frame rate and generating motion vectors that can be utilized to interpolate additional frames in accordance with embodiments of the invention is illustrated in FIG. 4.

The process 400 commences by selecting (405) a subset of frames from an original video sequence. In some embodiments, the process selects the subset of frames to include in a new video sequence by selecting each N^(th) frame in the sequence of frames, thereby reducing the number of frames in the new video sequence. For example, if the process selects every other frame in a sequence that includes an even number of total frames Z, starting from N (i.e., N, N+2, . . . Z), the process may be able to reduce the number of frames in the new video sequence to ½ the number of frames in the original video sequence (i.e., Z/2). Other embodiments may select frames using other mechanisms. Some embodiments may additionally select certain types of frames in the decoded video to be part of the subset of frames used to generate the new encoded video sequence. In particular, in MPEG video compression, frames may be designated as intra frames (e.g., I-frames) or inter frames (e.g., P-frames and B-frames) depending on different characteristics of the frame with respect to the encoding standard. An intra frame is a video frame that does not require other frames to decode; an inter frame may use data from one or more additional decoded frames in the sequence of frames to decode. In some embodiments, the process may select the intra frames in the video sequence to include in the subset of frames and only delete frames that are inter frames.
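The selection of every N^(th) frame described above can be sketched, purely as an example, using frame indices. The function names `split_frames` and `reduced_frame_rate` are hypothetical, indices are assumed to start at 0, and the sketch does not address the selection of frames by type (e.g., retaining all intra frames).

```python
def split_frames(frame_count, n):
    """Split frame indices 0..frame_count-1 into kept and dropped lists.

    Every n-th frame (0, n, 2n, ...) is kept for the new video sequence;
    the remaining frames are the ones to be described by motion vectors.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    kept = list(range(0, frame_count, n))
    kept_set = set(kept)
    dropped = [i for i in range(frame_count) if i not in kept_set]
    return kept, dropped


def reduced_frame_rate(original_fps, n):
    """Nominal frame rate of the new sequence when every n-th frame is kept."""
    return original_fps / n


if __name__ == "__main__":
    kept, dropped = split_frames(8, 2)
    print(kept, dropped, reduced_frame_rate(60, 2))
```

With eight frames and N equal to 2, half of the frames are kept, which matches the Z/2 example given in the preceding paragraph.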

The process generates (410), for each of the frames in the original video sequence that are not selected within the subset, motion vectors that describe the frames in terms of the movement of pixels in frames that are selected as part of the subset of frames from the original video sequence. In some embodiments, each pixel in a frame is described by a motion vector. In several embodiments, the motion vectors for the pixels may be compressed. In particular, in videos that show images where neighboring pixels in a frame generally move in the same direction as one another, the motion vectors may be compressed and/or a reduced number of motion vectors may be used to describe the movement of a collection of neighboring pixels. This may be particularly useful when computing motion vectors for videos that have certain characteristic optical flows, such as videos used to show a view moving along a path through a static environment. Several other embodiments may use a block of pixels (e.g., an 8×8 block or a 16×16 block) to generate a motion vector.

To compute a motion vector, the movement of one or more pixels may be analyzed across one or more frames of the original video sequence. In several embodiments, the process analyzes movement of pixels between a reference first frame (i.e., n) and a subsequent second frame (i.e., n+1). The motion vectors indicate interpolated positions of pixels at times between the first and second frames.
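The description does not prescribe a particular motion estimation technique. As one possible illustration, the sketch below uses exhaustive block matching with a sum of absolute differences (SAD) cost, a commonly used approach that is assumed here and is not stated in the described embodiments. Frames are represented as lists of rows of grayscale values, and all names are hypothetical.

```python
def block_sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a block in `ref` and a block in `cur`."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(ref[ry + dy][rx + dx] - cur[cy + dy][cx + dx])
    return total


def estimate_block_motion(ref, cur, bx, by, size, search):
    """Return the (dx, dy) that best maps the block at (bx, by) in `ref` into `cur`.

    Every displacement within +/- `search` pixels is tried and the one
    with the lowest SAD cost is kept.
    """
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = block_sad(ref, cur, bx, by, bx, by, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cx, cy = bx + dx, by + dy
            # Skip candidate blocks that fall outside the frame.
            if cx < 0 or cy < 0 or cx + size > width or cy + size > height:
                continue
            cost = block_sad(ref, cur, bx, by, cx, cy, size)
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best


def make_frame(width, height, x, y, size):
    """Build a black test frame containing one bright square at (x, y)."""
    frame = [[0] * width for _ in range(height)]
    for j in range(y, y + size):
        for i in range(x, x + size):
            frame[j][i] = 255
    return frame


reference = make_frame(8, 8, 2, 2, 2)   # frame n
following = make_frame(8, 8, 4, 3, 2)   # frame n+1: square moved by (2, 1)
vector = estimate_block_motion(reference, following, 2, 2, 2, 3)
```

An exhaustive search is chosen here only for clarity; its cost grows with the square of the search radius.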

In some embodiments, the process analyzes movement of pixels between a reference frame (i.e., n) and one or more subsequent frames (i.e., n+1, . . . n+r). Certain embodiments may analyze the movement of pixels between a reference frame that has been included in the subset of frames and the subsequent frames in the original video until reaching the next frame that is also included in the selected subset of frames. For example, if the subset of frames selected from the original video includes every 3rd frame (i.e., n, n+3, n+6, etc.) and the original video has a total of z frames such that the total number of frames has been reduced to z/3, then the process may compute, for the reference frame n, the movement of the pixels through two subsequent frames of video (i.e., n+1, n+2) and will not analyze the (n+3)^(rd) frame since this is the next reference frame that has been included in the subset of frames and for which motion vectors may be calculated.
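The grouping of non-selected frames under the reference frame that precedes them can be sketched as follows. This is an illustrative example only; the function name `frames_per_reference` and the zero-based indexing are assumptions.

```python
def frames_per_reference(num_frames, step):
    """Map each reference frame index to the discarded frames that follow it.

    With step=3 the reference frames are 0, 3, 6, ... and each one is
    analyzed against the frames up to, but not including, the next
    reference frame.
    """
    groups = {}
    for ref in range(0, num_frames, step):
        groups[ref] = list(range(ref + 1, min(ref + step, num_frames)))
    return groups


# Every 3rd frame of a 9-frame sequence: frame 0 is analyzed against 1 and 2,
# frame 3 against 4 and 5, and frame 6 against 7 and 8.
groups = frames_per_reference(9, 3)
```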

Several embodiments may also analyze the movement of pixels through preceding frames (i.e., n−1, n−2, etc.) relative to a reference frame (i.e., n). Based on the movement of pixels through the one or more frames of video relative to the reference frame, the process may compute motion vectors that describe the direction and the movement of the pixels through the frames. Some embodiments may store the components of the motion vectors using Cartesian coordinates that specify a set of [x,y] coordinates in a two dimensional plane while other embodiments may use polar coordinates that specify a [distance, angle]. Other embodiments may use different coordinate systems and/or values for the motion vectors.
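The relationship between the two coordinate representations can be illustrated with a short conversion sketch. The function names are hypothetical, and the convention of measuring the angle counterclockwise from the positive x axis in degrees is an assumption of this example.

```python
import math


def to_polar(x, y):
    """Convert a Cartesian motion vector [x, y] to [distance, angle in degrees]."""
    distance = math.hypot(x, y)
    angle = math.degrees(math.atan2(y, x)) % 360.0  # normalize to 0..360
    return distance, angle


def to_cartesian(distance, angle):
    """Convert a polar motion vector [distance, angle in degrees] to [x, y]."""
    radians = math.radians(angle)
    return distance * math.cos(radians), distance * math.sin(radians)


# A vector of 3 pixels along x and 4 pixels along y has a length of 5 pixels.
distance, angle = to_polar(3.0, 4.0)
x, y = to_cartesian(distance, angle)
```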

The process stores (415) the motion vectors describing the movement of pixels between at least a first frame in the subset of frames and at least a second frame that is not in the subset of frames. In some embodiments, the motion vectors are stored within one of the reference frames used in computing the motion vectors. Further details regarding processes for storing motion vectors within a frame are described below with reference to FIG. 5. In other embodiments, the motion vectors may be stored separately from the pixel data of a frame of video.

The process decimates (420) the frames not included in the subset of frames from the original video sequence to generate a new video sequence. In some embodiments, the process generates a new video sequence by copying the subset of frames from the original video sequence into the new video sequence. In other embodiments, the process removes frames from the sequence of frames in the original video sequence in order to generate the new video sequence. Other embodiments may decimate the frames from the original video sequence using other mechanisms.

In many embodiments, the process encodes (425) the new video, including the motion vectors, using standardized compression techniques (e.g., MPEG, HEVC, etc.). As described above, many of the standardized compression techniques compute their own set of motion vectors in order to encode inter frames. As some embodiments store the motion vectors within the frames of video, the subsequent encoding of these frames introduces a different set of motion vectors that may be used to decode the video. The motion vectors used to encode the inter frames are generally unavailable during playback of a video. A decoder typically returns decoded frames and does not expose the motion vectors relating the pixels or blocks of pixels within the frames. Thus by embedding the additional motion vectors within the frames, or separately, many embodiments of the invention are able to interpolate frames during the real-time playback of the video on a device using decoded frames output by a decoder. The process then completes.

Although specific encoding processes are described above with reference to FIG. 4, any of a variety of processes can be utilized to select a subset of video frames for encoding at a nominal frame rate and generate motion vectors for all of the frames discarded from the original video sequence as appropriate to the requirements of specific applications in accordance with embodiments of the invention. Processes for embedding motion vectors for use in interpolation of additional frames within the decoded frames of a video sequence in accordance with various embodiments of the invention are discussed further below.

Encoding Motion Vectors within Frames

As described above, some embodiments may encode the motion vectors within frames of video and these embedded motion vectors may be used to interpolate deleted frames of an original video sequence and/or additional frames and thus allow a playback device to vary the playback speed of the video while maintaining the viewing quality of the video. Some embodiments encode the motion vectors using color values of pixels in a frame of video. The color values used to store components of the motion vectors may be a pixel's red, green, and/or blue color values in the RGB color space, and/or a hue, saturation, and/or brightness value within the HSB color model, among various other color models.

A process for encoding motion vectors within the video data of a decoded video frame in accordance with an embodiment of the invention is illustrated in FIG. 5. The process 500 generates (505) a set of motion vectors for a frame of video. Various processes for generating motion vectors are described above with reference to FIG. 4. As described above, the motion vectors may be defined using Cartesian coordinates or polar coordinates. In the Cartesian coordinate system, a motion vector is usually represented by an X and Y component. For example, if a pixel moves 3 pixels to the right and 5.5 pixels upward, it can be represented by the vector (3, 5.5). In the polar coordinate system, a motion vector is usually represented by a distance and an angle (e.g., 0-360 degrees).

The process encodes (510) each motion vector within a corresponding frame of video. In particular, for a particular motion vector, the process may encode the motion vector within the color information of the pixels of a frame. Pixels may have red, green, and blue color values (i.e., RGB). Some embodiments may store a motion vector's X component as a red color value and the Y component as a green color value. In the RGB color space, each R, G, B color component may have a range from 0 to 255 and thus a motion vector may be encoded with a range of [0,0] to [255,255] pixels. Certain embodiments may shift the values to include negative values (e.g., [−128,−128] to [127,127]) and thus provide for a range of movement within different quadrants of the x,y plane. Furthermore, some embodiments may provide a greater level of precision by using a sub-pixel range of movement by dividing the range by the level of sub-pixel accuracy desired. For example, to obtain a 0.25 pixel movement, the range of 128 may be divided by 4. This provides a greater degree of accuracy at the expense of a shorter range. In this example, the range would be decreased to [−32,−32] to [31.75, 31.75] pixels. Using a greater degree of accuracy may become problematic when pixels in a video are moving fast.
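The shifted, sub-pixel encoding described above can be sketched as follows. The offset of 128, the parameter name `subpixel`, the function names, and the choice to leave the blue channel at zero are assumptions made for this example.

```python
def encode_component(value, subpixel=4):
    """Map a signed displacement in pixels to a 0..255 color channel value.

    `subpixel` is the number of steps per pixel (4 gives 0.25 pixel
    precision). The offset of 128 shifts the range to include negatives.
    """
    quantized = round(value * subpixel) + 128
    if not 0 <= quantized <= 255:
        raise ValueError("displacement is outside the encodable range")
    return quantized


def decode_component(channel, subpixel=4):
    """Recover the signed displacement in pixels from a channel value."""
    return (channel - 128) / subpixel


def encode_vector(x, y, subpixel=4):
    """Store X in the red channel and Y in the green channel; blue is unused here."""
    return encode_component(x, subpixel), encode_component(y, subpixel), 0


# With 0.25 pixel precision the encodable range is -32 to 31.75 pixels.
pixel = encode_vector(3.0, 5.5)
x = decode_component(pixel[0])
y = decode_component(pixel[1])
```

Displacements beyond the reduced range raise an error in this sketch, which reflects the trade-off between precision and range noted above.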

As described above, a problem with storing the motion vectors in the RGB values of a frame may arise when the frames are subsequently encoded using standard compression techniques, which tend to heavily compress the color information of a pixel yet preserve the luminance values of pixels. Thus, storing the motion vectors using the RGB values may cause the motion vectors to be less accurate when the RGB values of the pixels are subsequently decoded.

In order to both provide a greater level of precision for the motion vectors and to preserve the pixel values during a subsequent encoding, many embodiments store the motion vectors using polar coordinates embedded using the hue and the brightness value of a pixel. In particular, these embodiments may use the HSB (hue, saturation, brightness) color model, with the hue value of a pixel corresponding to an angle of the motion vector and the brightness corresponding to the length of the motion vector. Many embodiments use the hue and brightness values of pixels to store the components of the motion vectors in order to preserve the motion vector data and minimize the loss of data that may occur during a subsequent encoding of the video using a standardized compression mechanism. In particular, many compression techniques compress video images based on how the human brain interprets images and not based on mathematical accuracy. Thus, as noted above, luminance is preserved well, but color information is generally heavily compressed since the human brain is more sensitive to contrast than to color differences. Thus, using hue and brightness values to store the motion vector data helps preserve this data during subsequent encoding and decoding of the video frames.

The hue color model (or hue color spectrum) describes the range of colors along the color spectrum using angles. For example, in the hue color model, 0 degrees represents red, 60 degrees represents yellow, and so forth for the entire color spectrum. Given that polar coordinates are also defined using angles, the hue color model is well suited for storing the angular component of motion vectors. In particular, a motion vector may be embedded by storing the angle of the motion vector within the hue value of a pixel and the length within the brightness of a pixel. The brightness of a pixel may range from 0 to 255 and is generally highly accurate even after encoding since most standardized compression techniques preserve a pixel's brightness value with a higher degree of precision than a pixel's color values. Thus for example, a motion vector that is pointing 127 units to the right may be represented by a pixel with a red hue value (i.e., 0 degrees) at half intensity (i.e., 127). As can be readily appreciated, using the hue and brightness values of a pixel to store motion vectors allows the movement of pixels to be described in a full 360 degree rotation at a full range of 255 pixels in every direction. This provides a larger range compared to the [x,y] range that would otherwise be available using the RGB color components (i.e., [0,0] to [255, 255]), and even in situations where the range is lowered to allow for sub-pixel accuracy.
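The hue and brightness embedding can be illustrated with the Python standard library `colorsys` module, whose HSV model corresponds to the HSB model referred to above. The function names, the use of full saturation, and the rounding to 8-bit RGB values are assumptions of this sketch and are not stated in the described embodiments.

```python
import colorsys


def encode_polar_vector(length, angle_degrees):
    """Embed a polar motion vector in an 8-bit RGB pixel.

    The angle is stored as the hue and the length (0..255) as the
    brightness. Saturation is fixed at 1.0 so the hue remains recoverable.
    """
    if not 0 <= length <= 255:
        raise ValueError("length must fit in the 0..255 brightness range")
    hue = (angle_degrees % 360.0) / 360.0
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, length / 255.0)
    return round(r * 255), round(g * 255), round(b * 255)


def decode_polar_vector(rgb):
    """Recover (length, angle in degrees) from an 8-bit RGB pixel."""
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, _saturation, value = colorsys.rgb_to_hsv(r, g, b)
    return value * 255.0, hue * 360.0


# A vector pointing 127 units to the right: red hue (0 degrees) at half intensity.
pixel = encode_polar_vector(127, 0.0)
length, angle = decode_polar_vector(pixel)
```

A zero-length vector maps to a black pixel, from which no hue can be recovered; in this sketch that is harmless because a zero-length vector has no meaningful direction.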

In some embodiments, the size of a frame is increased to store the motion vectors. In particular, some embodiments may append rows of pixels to the bottom portion of a frame that contain the values of motion vectors. For example, a 720×1280p video may add in an additional 30 rows of pixels in order to add a total of 38,400 pixels (i.e., 30×1280) storing the values of motion vectors. Other embodiments may use a number of rows of pixels appropriate to the requirements of specific applications to store an appropriate number of motion vectors to enable the interpolation of intermediate frames between encoded frames in a decoded video sequence. Where the number of motion vectors is smaller than the number of pixels in a frame, motion vectors corresponding to specific pixel locations can be encoded and/or the pixel locations of the motion vectors encoded with the motion vector data. In many embodiments, a playback device may not display the portion of a frame that contains pixels corresponding to motion vectors. Other embodiments may store the motion vectors at different locations within a frame such as the top, sides, in certain areas within the frame or within a header and/or metadata corresponding to the video. Furthermore, as described above, other embodiments may store the motion vectors elsewhere within a container file, and/or in a separate file and provide references within the file to frames of video in the encoded video.
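Appending rows of motion vector pixels to the bottom of a frame, and separating them again before display, can be sketched as follows. The frame is represented as a list of rows of RGB tuples; the function names and the zero padding of a partially filled final row are assumptions of this example.

```python
def vector_capacity(rows, width):
    """Number of motion vector pixels that fit in `rows` appended rows."""
    return rows * width


def append_vector_rows(frame, vector_pixels, width):
    """Return a taller frame with the motion vector pixels appended at the bottom."""
    rows_needed = -(-len(vector_pixels) // width)  # ceiling division
    padding = [(0, 0, 0)] * (rows_needed * width - len(vector_pixels))
    padded = list(vector_pixels) + padding
    extra_rows = [padded[i * width:(i + 1) * width] for i in range(rows_needed)]
    return frame + extra_rows


def split_vector_rows(frame, picture_height):
    """Separate the displayable picture from the appended motion vector rows."""
    return frame[:picture_height], frame[picture_height:]


# 30 appended rows in a 1280 pixel wide frame hold 30 x 1280 = 38,400 pixels.
capacity = vector_capacity(30, 1280)
```

A playback device following this layout would display only the first `picture_height` rows.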

Although specific processes for encoding motion vectors are described above with respect to FIG. 5, any of a variety of processes can be utilized to encode motion vectors as appropriate to the requirements of specific applications in accordance with embodiments of the invention.

Decoding Frames and Using Motion Vectors to Interpolate Intermediate Frames

A playback device may request and receive an encoded video sequence with embedded motion vectors and use the motion vectors to play back the video at a higher frame rate than the nominal frame rate of the encoded video. This is particularly useful in allowing a playback device to adjust the playback speed of a video sequence while maintaining a frame rate that provides a smooth visual quality to the video. For example, the playback speed may be slowed to any particular speed and the playback device may be able to generate new frames in real-time in order to maintain the frame rate of the video and avoid a stuttering effect that would otherwise become apparent without the additional interpolated frames. A process for using motion vectors to interpolate frames from the decoded frames of an encoded video sequence in accordance with an embodiment of the invention is illustrated in FIG. 6.

The process 600 extracts (605) encoded frames from the encoded video sequence. The encoded video may be streamed to a playback device from a distribution server and the playback device may store the streamed frames of encoded video within a buffer.

The process provides (610) the encoded frames to a decoder for decoding. Based on the particular encoding standard used to encode the frames of the video, the process may use an appropriate standards-based decoder. Typically, encoding parameters are provided with the encoded video and the encoding parameters are utilized to configure the video decoder. A key parameter for configuring the video decoder is the frame rate of the video sequence. As noted above, the frame rate of the encoded video is the nominal frame rate or rate at which encoded frames are provided to the video decoder. A media player application can play back video at a higher frame rate by using the decoded frames output by the decoder for presentation at the nominal frame rate and the motion vectors to interpolate additional frames. The process next determines (615) whether a given playback speed of the video on the device requires interpolating additional frames. If the playback speed does not require interpolating additional frames, the process plays back (630) the decoded frames of video. For example, if an encoded video is received with a frame rate of 30 frames per second and the playback device is playing back the video at a normal speed (i.e., without increasing or decreasing the playback speed), then the device can play back the decoded video at the same frame rate of 30 frames per second. However, if the device adjusts the playback speed to either increase or decrease the speed of the video, this may require interpolating additional frames in order to maintain the smooth visual quality of the video. Note that increasing the speed may not involve changing the frame rate at which video is played back. In many video games movement through a virtual world can involve displaying frames based upon motion. Therefore, slowing the speed of movement may involve interpolation in order to maintain the same frame rate at the slower rate of motion. Accordingly, processes in accordance with embodiments of the invention can interpolate frames to increase frame rate and/or to maintain frame rate in circumstances where additional spacing in time is desired between playback of decoded frames.

When the process determines (615) that it needs additional frames during playback, the process extracts (620) motion vectors for interpolating one or more additional frames using one or more decoded frames. As described above, motion vectors may be extracted during the real-time playback of a video sequence and used to interpolate additional frames when needed to provide a smooth viewing experience even when the playback speed of the video is being adjusted. In order to allow for the interpolation of frames during the real-time playback, some embodiments store motion vectors in a location that is readily available to the playback device and thus can be quickly processed to interpolate additional frames. In several embodiments, the motion vectors are stored as additional pixels within decoded frames. In these embodiments, the process is able to analyze the motion vectors embedded within a particular decoded frame in order to interpolate additional frames. Some embodiments may extract the motion vectors from a separate file that accompanies an encoded video sequence.

Other embodiments may encode the motion vectors within the pixel data of the decoded frames and a playback device may extract these motion vectors by analyzing the frame. In particular, the motion vectors may be extracted based on the color values of a subset of pixels within the frame. For example, the motion vectors may be embedded within pixels in the last 30 rows of a frame for a video encoded at 720p or the last 50 rows for a video encoded at 1080p. The specific number of rows/motion vectors typically depends upon the requirements of a specific application.

Furthermore, as described above, the motion vectors may be embedded in the color values of the pixels, including one or more of the RGB values, the hue values, and/or the brightness values. In particular, many embodiments store the angle component of a motion vector using a hue value of a pixel and the length component of the motion vector using the brightness value of the pixel. In embodiments that use hue values to store the angle of a motion vector, these embodiments may use the graphics processing unit (GPU) in order to facilitate the real time decoding of the hue data. Several other embodiments may use look-up tables with a pixel's red, green and blue component values corresponding to an angle.

The process interpolates (625) additional frames using motion vectors and the one or more decoded frames based on the particular playback speed at which the video is to be played. For example, if the encoded video is provided at a frame rate of 30 frames per second and the playback device has specified a playback speed of the video at ½ the normal speed, the playback device would need to double the number of frames in the video sequence in order to maintain the visual quality of the video (i.e., by maintaining a frame rate of 30 frames per second) at ½ playback speed. In some embodiments, new frames are generated by starting with the pixel data in the closest reference frame provided by the decoded video. Then, using the motion vectors extracted for the particular reference frame, some embodiments interpolate the new pixel locations for the point in time of the new frame.
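The relationship between playback speed and the frames to be interpolated can be sketched as follows, assuming that the display frame rate equals the nominal frame rate and that a pixel is displaced linearly along its motion vector. Both assumptions, and all names, are introduced for this example only.

```python
import math
from fractions import Fraction


def source_positions(num_output_frames, speed):
    """For each displayed frame, return (reference frame index, fraction).

    The fraction (0 <= fraction < 1) is the position in time between the
    reference frame and the next decoded frame. A fraction of 0 means the
    decoded frame can be shown as-is; otherwise a frame is interpolated.
    """
    positions = []
    for k in range(num_output_frames):
        position = Fraction(k) * Fraction(speed)
        reference = math.floor(position)
        positions.append((reference, position - reference))
    return positions


def displace(point, vector, fraction):
    """Move a pixel location along its motion vector by `fraction` of its length."""
    x, y = point
    dx, dy = vector
    return x + fraction * dx, y + fraction * dy


# At 1/2 speed every decoded frame is followed by one interpolated frame,
# doubling the number of displayed frames.
positions = source_positions(4, Fraction(1, 2))
halfway = displace((10, 10), (4, -2), Fraction(1, 2))
```

Exact fractions are used here so that repeated speed factors such as 1/3 do not accumulate rounding error.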

In several embodiments, in order to interpolate a new frame, the process computes a value for each pixel in the new frame based on the pixel values of a reference frame and the motion vectors provided by the reference frame. However, in many embodiments, the number of motion vectors in a reference frame may be less than the number of pixels in a frame and thus in order to determine the movement of every pixel from the reference frame to a new frame, a number of embodiments apply bilinear filtering to compute motion vectors for every pixel in the frame based upon the motion vectors that are specified. This is particularly useful in video images where neighboring motion vectors share similar properties such as video of motion along a straight line through a static environment. For example, for a video taken from the vantage point of a person walking down a path in a forest, as the person moves along the path, the neighboring pixels will move in a similar manner relative to each other. Thus in this type of video, a smaller set of motion vectors may be stored and the values of pixels may be interpolated using bilinear filtering.
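Bilinear filtering of a sparse set of motion vectors can be sketched as follows. The sketch assumes the stored vectors lie on a regular grid with a fixed pixel spacing, which is an assumption of this example and is not required by the described embodiments; all names are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def bilinear_vector(grid, spacing, x, y):
    """Estimate the motion vector at pixel (x, y) from a sparse grid.

    grid[j][i] holds the (dx, dy) vector stored for pixel
    (i * spacing, j * spacing).
    """
    gx, gy = x / spacing, y / spacing
    i0 = min(int(gx), len(grid[0]) - 1)
    j0 = min(int(gy), len(grid) - 1)
    i1 = min(i0 + 1, len(grid[0]) - 1)  # clamp at the right edge
    j1 = min(j0 + 1, len(grid) - 1)     # clamp at the bottom edge
    fx, fy = gx - i0, gy - j0
    # Interpolate along x on the two bounding rows, then along y.
    top = [lerp(grid[j0][i0][c], grid[j0][i1][c], fx) for c in range(2)]
    bottom = [lerp(grid[j1][i0][c], grid[j1][i1][c], fx) for c in range(2)]
    return tuple(lerp(top[c], bottom[c], fy) for c in range(2))


# Four stored vectors, 8 pixels apart; the pixel at the center of the cell
# receives the average of its four neighbors.
grid = [[(0.0, 0.0), (8.0, 0.0)],
        [(0.0, 8.0), (8.0, 8.0)]]
center = bilinear_vector(grid, 8, 4, 4)
```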

The process displays (630) the decoded frames, including the reference frame followed by the interpolated frames until the next decoded frame. The process determines (635) whether there are more encoded frames that need to be played back on the device. If there are more encoded frames, the process returns to (605). Otherwise, the process completes.

Although specific processes for decoding video for playback on playback devices are described above with reference to FIG. 6, any of a variety of processes may be utilized for decoding frames of encoded video and interpolating additional frames using the decoded frames as appropriate to the requirements of specific playback devices in accordance with embodiments of the invention.

Although the present invention has been described in certain specific aspects, many additional modifications and variations would be apparent to those skilled in the art. It is therefore to be understood that the present invention may be practiced otherwise than specifically described. Thus, embodiments of the present invention should be considered in all respects as illustrative and not restrictive. Accordingly, the scope of the invention should be determined not by the embodiments illustrated, but by the appended claims and their equivalents.

What is claimed is:
 1. A source encoder, comprising: a processor; memory including an encoder application; where the encoder application directs the processor to: select a subset of frames from a first video sequence; generate motion vectors describing frames from the first video sequence that are not part of the selected subset of frames, wherein each motion vector describes movement between a frame in the subset of frames and a frame not included in the subset of frames; store the motion vectors; decimate frames not included in the subset of frames from the first video sequence to generate a second video sequence having a nominal frame rate less than the frame rate of the first video sequence; and encode the second video sequence at the nominal frame rate.